Abstract

Agriculture is critical to India's socioeconomic system and is one of the most
important industries in the Indian economy, accounting for more than 18% of
GDP. Almost 58% of India's population relies largely on agriculture for a
living, making India a prominent participant in the global agriculture
business. Farmers often plant the same crop every season rather than growing
different crops in different seasons. They also apply extra fertilizers
without understanding their exact composition or dose. Giving farmers timely
access to insightful information would allow them to apply best practices and
manage their land more effectively, reducing losses and increasing revenues.
The proposed method assists farmers in selecting the best crop for their
requirements, taking into account the sowing season, soil, and geographic
location, and in selecting the best fertilizer to apply based on soil and
weather conditions. This improves agricultural productivity and revenues. As a
consequence, farmers may use our technology to grow crops throughout the year
at a better profit while reducing soil deterioration. This is made possible by
several machine learning (ML) algorithms: the method is implemented using
Decision Tree, Random Forest, Naive Bayes, and KNN.

Keywords: soil degradation; Decision tree; agriculture; GDP; fertilizers;
geographics; weather; technology; Random forest; Naive Bayes; KNN


1. Introduction 


Agriculture provides a living for about 58% of the people in our country.
Weather, environmental changes, rainfall, and fertiliser application all have
an impact on crop yield. As a result, farmers are often unable to achieve the
expected agricultural yield (Priyadharshini et al.). Taking into consideration
significant environmental variables, geographic location, and soil
composition, our proposed technique leverages machine learning to aid farmers
by selecting the best crop for their land and, based on that recommendation,
the appropriate fertiliser for their field (Varsha). Using technology in this
way can maximise profit, enhance crop quality, and significantly boost output,
and farmers benefit from a successful harvest with minimal failures
(Viswanathan et al.). This study employs a variety of machine learning
approaches, including Random Forest (Shaurya, Aishwarya, and Rohilla;
Wickramasinghe et al.). This is a real-world agricultural problem that is
solved using ML. Our project can help farmers in the following ways:

• Food Security: Precise crop prediction can contribute to food security by
helping farmers plan their crops ahead of time and adjust planting and
harvesting dates depending on projected yields. This contributes to a
consistent supply of food and reduces the possibility of food shortages.
(Jeevaganesh, Harish, and B. Priya)

• Economic Development: Crop prediction can also aid economic development by
helping farmers make better-informed decisions about when to plant, when to
harvest, and what crops to cultivate. This can result in larger yields,
higher-quality crops, and, eventually, higher revenues for farmers. (K. D.
Priya et al.)

• Environmental Sustainability: Fertilizer forecasting can also promote
sustainable agricultural practices by limiting fertilizer runoff into
waterways and lowering the danger of soil damage. (Jadhav et al.) This can
help maintain ecosystems and enhance the long-term viability of agriculture.
(Bhansali et al.)

• Resource Efficiency: Fertilizer prediction can help farmers make better use
of resources by lowering the quantity of fertilizer required to achieve
targeted crop yields. This can help cut production costs while also reducing
agriculture's environmental effect. (Premasudha, T. D. K, and T. K)

• Increased Agricultural Productivity: Precise fertilizer forecasting can help
farmers apply the appropriate quantity of nutrients to their crops at the
appropriate time. This can result in higher yields, better crop quality, and,
ultimately, higher production. (Sivanandam et al.)


1.1. Problem Definition 


Crop forecasting helps farmers select the best crop to grow in order to
maximise productivity and, hence, profit. We also provide fertiliser advice,
since there have been several cases of crop failure in the past owing to a
lack of knowledge about proper fertilisers and pesticides. Such advice can be
highly beneficial in securing a good output. Our system reduces this risk by
steering farmers towards optimal output and profit maximisation.


1.2. Scope


In terms of societal impact, digital agriculture technology will be a game
changer. Farmers will profit the most, since they will incur fewer losses
while producing and earning more. Furthermore, this technology provides
farmers with better fertiliser guidance. It helps farmers select the crop that
will thrive best in their location and at the appropriate time of sowing. As a
result, it will have a significant impact on society and the environment.


2. Methodology 


We are building a web application that serves as the user interface for this
project and gives farmers access to its core functionality. The application is
user-friendly: it takes user input, passes it to the backend, and displays the
predicted results in the interface. Based on soil composition and
environmental characteristics such as temperature, humidity, soil pH, and
rainfall, the proposed method predicts the best crop and fertiliser for a
specific plot of land.


[Figure: crop data and fertilizer data (dataset from Kaggle) pass through data
collection, data preprocessing, and a machine learning algorithm, which
predicts the suitable crop and the suitable fertilizer]


FIGURE 1. Overall Architecture


2.1. Data Collection 


Data acquisition is one of the most important steps in machine learning. We
obtained the crop dataset and the fertiliser dataset from the Kaggle.com
website. The datasets include the following variables: soil pH, temperature,
humidity, rainfall, NPK levels, crop type, and fertiliser name. The crop
dataset has 2200 rows and 8 columns, whereas the fertiliser dataset has 205
rows and 9 columns.


2.2. Data Preprocessing 


After the dataset has been collected, it must be pre-processed before it is
fed into a machine learning model. Data preparation is done in steps: reading
the collected data, cleaning it, handling null and duplicated entries, and
removing unwanted attributes. The primary goal of data preparation is to
improve model performance. After this phase, the dataset is divided into
training and test data.
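
The preprocessing steps described above can be sketched in a few lines of
plain Python. This is a minimal illustration, not the implementation used in
this work: the function names, the toy records, and the default 30% hold-out
fraction are assumptions. The support totals in the classification reports
(660 of 2200 crop rows and 41 of 205 fertiliser rows) are consistent with
hold-out fractions of about 30% and 20%, but the exact split procedure is not
stated.

```python
import random

def preprocess(rows):
    """Drop rows with missing values, then drop exact duplicates (order kept)."""
    seen = set()
    cleaned = []
    for row in rows:
        if any(value is None or value == "" for value in row.values()):
            continue  # null handling: skip incomplete records
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # duplicate handling: keep the first occurrence only
        seen.add(key)
        cleaned.append(row)
    return cleaned

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Toy records; the values are invented for illustration only.
raw = [
    {"N": 90, "P": 42, "K": 43, "ph": 6.5, "label": "rice"},
    {"N": 90, "P": 42, "K": 43, "ph": 6.5, "label": "rice"},     # duplicate
    {"N": 20, "P": 67, "K": 20, "ph": None, "label": "lentil"},  # null value
    {"N": 71, "P": 54, "K": 16, "ph": 6.0, "label": "maize"},
    {"N": 40, "P": 72, "K": 77, "ph": 7.2, "label": "chickpea"},
    {"N": 118, "P": 33, "K": 30, "ph": 6.4, "label": "cotton"},
]
clean = preprocess(raw)
train, test = train_test_split(clean, test_fraction=0.25)
print(len(clean), len(train), len(test))
```

A fixed seed keeps the split reproducible, which matters when accuracies of
several classifiers are compared on the same test data.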


2.3. Machine Learning Algorithms 


Machine learning employs a variety of algorithms to construct mathematical
models and make predictions based on past data. Because crop and fertiliser
recommendation are both classification problems, we chose supervised machine
learning techniques. To predict a suitable crop, we employed the Decision
Tree, KNN, and Naive Bayes classifiers. To predict the fertiliser, we employed
the Decision Tree, Random Forest, KNN, and SVM classifiers.


2.4. Crop Prediction 


During this stage, we predict which crop is best suited to the soil. The model
accepts NPK content, temperature, humidity, pH, and rainfall as input. The
procedure begins with importing the dataset, followed by feature selection and
data cleaning. The data is then visualized using Python tools such as
matplotlib and seaborn. Once all pre-processing steps are complete, we train
the model using machine learning methods such as decision trees, KNN, and
naive Bayes. The trained model is then given the input to predict the most
suitable crop.
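
To make the classification step concrete, the following is a from-scratch
sketch of a Gaussian Naive Bayes classifier over the seven inputs listed above
(N, P, K, temperature, humidity, pH, rainfall). It is an illustration under
stated assumptions rather than the code used in this work: the class name, the
variance-smoothing constant, and the toy training rows are hypothetical, and
in practice a library implementation would normally be used.

```python
import math
from collections import defaultdict

class GaussianNaiveBayes:
    """Gaussian Naive Bayes: per-class, per-feature normal likelihoods."""

    def fit(self, X, y):
        grouped = defaultdict(list)
        for features, label in zip(X, y):
            grouped[label].append(features)
        self.stats = {}
        for label, rows in grouped.items():
            prior = len(rows) / len(X)
            columns = list(zip(*rows))
            means = [sum(col) / len(col) for col in columns]
            # A small constant keeps every variance strictly positive.
            variances = [
                sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                for col, m in zip(columns, means)
            ]
            self.stats[label] = (prior, means, variances)
        return self

    def predict_one(self, features):
        """Return the label with the highest log-posterior score."""
        best_label, best_score = None, -math.inf
        for label, (prior, means, variances) in self.stats.items():
            score = math.log(prior)
            for x, m, var in zip(features, means, variances):
                score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy rows: [N, P, K, temperature, humidity, pH, rainfall]; values are invented.
X = [
    [90, 42, 43, 21, 82, 6.5, 203],
    [85, 58, 41, 22, 80, 7.0, 227],
    [60, 55, 44, 23, 82, 7.8, 264],
    [40, 72, 77, 17, 17, 7.5, 89],
    [23, 72, 84, 19, 17, 6.9, 79],
    [39, 58, 85, 17, 15, 7.2, 68],
]
y = ["rice", "rice", "rice", "chickpea", "chickpea", "chickpea"]

model = GaussianNaiveBayes().fit(X, y)
print(model.predict_one([80, 50, 42, 22, 81, 7.0, 230]))
print(model.predict_one([35, 70, 80, 18, 16, 7.1, 80]))
```

Working in log space avoids numerical underflow when the per-feature
likelihoods are multiplied together.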


2.5. Fertilizer Prediction 


At this stage, we predict the optimal fertilizer for the soil. The model
accepts NPK concentration, temperature, humidity, moisture, soil type, and
crop type as inputs. The method begins with importing the dataset, followed by
feature selection and data cleaning; the categorical crop type column is then
transformed to integers using a label encoder. The data is then visualized
using Python tools such as matplotlib and seaborn. Once all pre-processing
steps are complete, we train the model using machine learning techniques such
as random forest, KNN, and decision tree. The trained model is then given the
input to predict the optimal fertilizer.
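
As an illustration of one of the techniques named above, the sketch below
implements a k-nearest-neighbours vote from scratch. It is not the code used
in this work: the function name, the choice of k = 3, the Euclidean distance
on unscaled features, and the toy rows are all assumptions made for the
example, and only numeric inputs are used.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k nearest training points."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Toy rows: [temperature, humidity, moisture, N, P, K]; values are invented.
train_X = [
    [26, 52, 38, 37, 0, 0],
    [29, 52, 45, 35, 0, 0],
    [34, 65, 62, 39, 0, 0],
    [28, 54, 46, 12, 36, 0],
    [30, 60, 42, 9, 40, 0],
    [33, 64, 50, 13, 40, 0],
]
train_y = ["Urea", "Urea", "Urea", "DAP", "DAP", "DAP"]

print(knn_predict(train_X, train_y, [31, 62, 48, 36, 0, 0]))
print(knn_predict(train_X, train_y, [29, 58, 44, 11, 38, 0]))
```

Because KNN relies on distances, features with large numeric ranges dominate
the vote unless the inputs are scaled first; the sketch omits scaling for
brevity.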


2023, Vol. 05, Issue 05S May 


3. Results and Discussion 


The results show that the Gaussian Naive Bayes algorithm is best suited to the
crop recommendation system, with an accuracy of 99.54%, and that the Random
Forest algorithm is best suited to the fertilizer recommendation system, with
an accuracy of 100%.


FIGURE 2. Count plot


Figure 2 visualizes the value counts for each unique crop type in the target
column of the dataset, plotted with the seaborn library.


FIGURE 3. Nitrogen requirement


Figure 3 shows the nitrogen requirements of various crops.

The heat map is drawn to show the relationship between two variables, one
plotted on each axis. Using it, we can easily observe whether there are any
patterns in the values of one or both variables.


FIGURE 4. Heat map


              precision    recall  f1-score   support

       apple       1.00      1.00      1.00        28
      banana       1.00      1.00      1.00        29
   blackgram       1.00      1.00      1.00        35
    chickpea       1.00      1.00      1.00        30
     coconut       1.00      1.00      1.00        33
      coffee       1.00      1.00      1.00        37
      cotton       1.00      1.00      1.00        36
      grapes       1.00      1.00      1.00        23
        jute       0.96      0.92      0.94        26
 kidneybeans       1.00      1.00      1.00        33
      lentil       1.00      1.00      1.00        42
       maize       1.00      1.00      1.00        30
       mango       1.00      1.00      1.00        29
   mothbeans       1.00      1.00      1.00        29
    mungbean       1.00      1.00      1.00        26
   muskmelon       1.00      1.00      1.00        28
      orange       1.00      1.00      1.00        33
      papaya       1.00      1.00      1.00        26
  pigeonpeas       1.00      1.00      1.00        23
 pomegranate       1.00      1.00      1.00        28
        rice       0.94      0.97      0.96        34
  watermelon       1.00      1.00      1.00        22

    accuracy                           1.00       660
   macro avg       1.00      1.00      1.00       660
weighted avg       1.00      1.00      1.00       660


FIGURE 5. Classification report 


Figure 5 is the final classification report for the crop prediction model,
which uses the Gaussian Naive Bayes algorithm and achieves an accuracy of
99.54%.
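
The per-class figures in such a report follow directly from the counts of
correct and incorrect predictions. The sketch below computes precision,
recall, F1-score, and support from two label lists. The function name is
hypothetical, and the example labels are one assumed confusion pattern (two
jute samples predicted as rice and one rice sample predicted as jute) that
reproduces the jute and rice rows of the report; the actual confusion matrix
is not given here.

```python
from collections import Counter

def classification_metrics(y_true, y_pred):
    """Per-class precision, recall, F1 and support, plus overall accuracy."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall,
                         "f1": f1, "support": actual}
    accuracy = sum(tp.values()) / len(y_true)
    return report, accuracy

# Assumed confusion pattern for the two imperfect classes only.
y_true = ["jute"] * 26 + ["rice"] * 34
y_pred = ["jute"] * 24 + ["rice"] * 2 + ["rice"] * 33 + ["jute"] * 1

report, accuracy = classification_metrics(y_true, y_pred)
for label, row in report.items():
    print(label, {name: round(value, 2) for name, value in row.items()})
```

Under this assumed pattern there are three errors in total; if the remaining
600 test samples are all classified correctly, the overall accuracy is
657/660, or about 99.5%, which agrees with the reported figure.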

[Figure: box plot titled "Checking Outliers"; horizontal axis: Temperature]


FIGURE 6. Boxplot


The plot above is used to check for outliers among the temperature values in
the dataset. A boxplot displays the distribution of the data using quartiles.
The box represents the middle 50% of the data, with the median value (the
value that separates the top 50% from the bottom 50%) shown as a line within
the box.
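
The quantities a boxplot draws can also be computed directly. The sketch below
derives the quartiles and flags values outside the whiskers, assuming the
common 1.5 x IQR convention for whisker length; the function name and the
sample temperatures are invented for illustration and are not taken from the
dataset.

```python
import statistics

def boxplot_summary(values):
    """Quartiles, IQR and the points outside the 1.5 * IQR whiskers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr,
            "outliers": outliers}

# Invented temperature readings with one obviously extreme value.
temperatures = [26, 27, 28, 29, 30, 30, 31, 32, 33, 34, 48]
summary = boxplot_summary(temperatures)
print(summary)
```

Here the box spans 28.5 to 32.5, so the whisker limits are 22.5 and 38.5 and
only the reading of 48 is flagged as an outlier.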


[Figure: box plot titled "Relation with output variable"; horizontal axis:
Fertilizer (Urea, DAP, 14-35-14, 28-28, 17-17-17, 20-20, 10-26-26)]


FIGURE 7. Boxplot for two columns


The box plot above shows the variation of temperature for each fertilizer
type.

Label encoding is applied to the crop type (string) column. Label encoding is
a technique used to convert categorical variables into numerical variables
that can be used in a machine learning algorithm.


Original       Encoded
Barley         0
Cotton         1
Ground Nuts    2
Maize          3
Millets        4
Oil seeds      5
Paddy          6
Pulses         7
Sugarcane      8


FIGURE 8. Label encoding
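
A label encoder of this kind can be sketched as a mapping from each distinct
category to an integer. The version below assigns codes in sorted order, which
is an assumption, although it agrees with the codes visible in Figure 8
(Barley 0, Cotton 1). The function names are hypothetical.

```python
def fit_label_encoder(values):
    """Map each distinct category to an integer, in sorted order."""
    return {category: code for code, category in enumerate(sorted(set(values)))}

def encode(values, encoder):
    """Replace every category by its integer code."""
    return [encoder[value] for value in values]

crop_types = ["Maize", "Sugarcane", "Cotton", "Paddy", "Barley", "Cotton",
              "Ground Nuts", "Millets", "Oil seeds", "Pulses"]
encoder = fit_label_encoder(crop_types)
encoded = encode(crop_types, encoder)
print(encoder)
print(encoded)
```

The integer codes carry no numeric meaning of their own. Tree-based models
such as decision trees and random forests are largely insensitive to this,
whereas distance-based models such as KNN treat the codes as magnitudes.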


              precision    recall  f1-score   support

    10-26-26       1.00      1.00      1.00         1
    14-35-14       1.00      1.00      1.00         4
    17-17-17       1.00      1.00      1.00         5
       20-20       1.00      1.00      1.00         5
       28-28       1.00      1.00      1.00         9
         DAP       1.00      1.00      1.00         7
        Urea       1.00      1.00      1.00        10

    accuracy                           1.00        41
   macro avg       1.00      1.00      1.00        41
weighted avg       1.00      1.00      1.00        41


FIGURE 9. Classification report 


Figure 9 is the final classification report for the fertilizer prediction
model, which uses the Random Forest classifier and achieves an accuracy of
100%.

The algorithms used for crop recommendation are Decision Tree, KNN, and
Gaussian Naive Bayes, with accuracies of 98.63%, 98.33%, and 99.54%
respectively. For fertilizer recommendation we used Decision Tree, KNN, and
Random Forest, with accuracies of 97.56%, 95%, and 100% respectively.


4. Conclusion 


The proposed approach helps farmers select the appropriate crop and fertilizer
by delivering insights that farmers would otherwise have to keep track of
themselves, reducing crop failure and boosting output. It also keeps them from
losing money. In the future, we want to integrate leaf disease detection
alongside these modules to further benefit farmers.